- 
                Notifications
    You must be signed in to change notification settings 
- Fork 79
Implement algorithm for calculating species tree topology distributions. #650
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement algorithm for calculating species tree topology distributions. #650
Conversation
1f9280c    to
    d0d4242      
    Compare
  
    | Codecov Report
 @@            Coverage Diff             @@
##           master     #650      +/-   ##
==========================================
+ Coverage   87.71%   87.77%   +0.05%     
==========================================
  Files          23       23              
  Lines       17360    17468     +108     
  Branches     3421     3456      +35     
==========================================
+ Hits        15227    15332     +105     
  Misses       1044     1044              
- Partials     1089     1092       +3     
 
 Continue to review full report at Codecov. 
 | 
| Codecov Report
 @@            Coverage Diff             @@
##           master     #650      +/-   ##
==========================================
+ Coverage   87.63%   87.74%   +0.10%     
==========================================
  Files          23       23              
  Lines       17523    17668     +145     
  Branches     3450     3498      +48     
==========================================
+ Hits        15357    15503     +146     
  Misses       1062     1062              
+ Partials     1104     1103       -1     
 
 Continue to review full report at Codecov. 
 | 
3fe1fde    to
    9d6cd23      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some quick feedback here @daniel-goldstein, I've not grokked how it fits together yet or the tests.
Should we follow the pattern of adding wrapper methods on Tree and TreeSequence (and move the docstrings over) to not have to expose the combinatorics module? If so, should we lift the functions over entirely or make the Tree{Sequence} methods one-liners?
Yes, let's one-liner methods to the Tree and TreeSequence classes.
9d6cd23    to
    fd6e6fe      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @daniel-goldstein! The main changes we need to make I think is on the interface, as we discussed.
fd6e6fe    to
    90cf5d0      
    Compare
  
    | @jeromekelleher Addressed the changes we discussed. Created a  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @daniel-goldstein - a few more minor comments.
90cf5d0    to
    192e238      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of more comments @daniel-goldstein.
@petrelharp, do you want to take a look through this, or should I merge? We want to start using this pretty soon, and I think the overall API is probably pretty solid (we should mark it as alpha anyway, so people don't have too high expectations of stability).
192e238    to
    be1bc0b      
    Compare
  
    | Looks good! I had some clarifying comments on the docs, and am happy to look again if you want, but don't have to. | 
| Thanks @petrelharp! | 
be1bc0b    to
    57dde5f      
    Compare
  
    | OK, great, I'm going to mark this for merging. @daniel-goldstein, can you open issues to track anything we've left dangling/unfinished here please? | 
| @Mergifyio refresh | 
| Travis AWOL again! Kicking it. | 
| Command  | 
57dde5f    to
    88efe1c      
    Compare
  
    
This implements both per-tree and incremental algorithms for computing exact tree topology distributions on tree sequences.
As discussed in my talk last week, this uses a DP approach to compute all possible species topology distributions at each node in the tree, starting from leaves up to roots. The
TopologyTreeclass is used to store topology distributions at each node. This approach does not support internal samples, and enforces that samples be a subset of leaves (leaves that are not samples are ignored). This does, nicely, support multiply-rooted trees. As there can't be species topologies that span different roots, we compute the topologies for each root usingTopologyTreeand "add" the resulting distributions together.Since this approach needs only a single upward traversal, the incremental algorithm is straightforward. We maintain a reference to the
TopologyTreefor each node and reconstruct thoseTopologyTrees on the upward traversals when inserting/removing edge diffs.Should we follow the pattern of adding wrapper methods on
TreeandTreeSequence(and move the docstrings over) to not have to expose the combinatorics module? If so, should we lift the functions over entirely or make theTree{Sequence}methods one-liners?cc @jeromekelleher @hyanwong